 prediction accuracy


Stein-Encoder: A White-Box Supervised Encoder via Stein Identities in Multi-Modal Studies

arXiv.org Machine Learning

In multi-modal biomedical research, integrating high-dimensional genomic data with clinical baselines is essential for precision medicine. However, standard deep neural network approaches often entangle these modalities, obscuring the specific predictive impact of genetic features and possibly leading to suboptimal predictive performance. Motivated by the landmark METABRIC cohort study of primary breast tumors, we propose the Stein-Encoder, a white-box supervised framework designed to isolate the genetic signal driving clinical outcomes conditional on nuisance covariates. By leveraging Stein's method and residualization techniques, our approach constructs an interpretable single index that summarizes relevant biological heterogeneity while flexibly incorporating clinical factors, and that can be used to improve downstream prediction. We establish theoretical guarantees for identification, consistency and efficiency improvement. Applied to the METABRIC cohort, the Stein-Encoder outperforms unsupervised benchmarks in predictive accuracy. Crucially, it achieves structural disentanglement by revealing response-specific biological mechanisms: we find that tumor size is driven primarily by mitotic networks, whereas prognostic indices rely on a distinct proliferation-versus-immune axis. This work contributes a unified, computationally efficient framework that bridges statistical rigor with the representational power of neural networks, enabling interpretable, task-specific and efficient compression of multi-modal health data for a wide range of precision medicine applications beyond biomarker discovery.
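
To make the idea of residualization followed by a Stein identity concrete, here is a minimal sketch on synthetic data. It is not the paper's estimator: the data-generating model, the variable names (`x` for genomic features, `c` for a clinical covariate, `beta` for the index direction) and the Gaussian feature assumption are all illustrative assumptions. It only uses the classical first-order Stein identity, which says that for standard Gaussian `x` the moment E[y x] is proportional to the index direction of a single-index model.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20_000, 5

# Hypothetical single-index model: y depends on x only through x @ beta,
# plus an additive effect of one nuisance covariate c (all names are assumptions).
beta = np.array([3.0, -2.0, 1.0, 0.0, 0.0])
beta /= np.linalg.norm(beta)
x = rng.standard_normal((n, d))   # assumed standard Gaussian features
c = rng.standard_normal(n)        # assumed nuisance covariate, independent of x
y = np.tanh(x @ beta) + 0.5 * c + 0.1 * rng.standard_normal(n)

# Residualization: remove the part of y explained by the nuisance covariate
# (intercept plus c) with ordinary least squares.
C = np.column_stack([np.ones(n), c])
coef, *_ = np.linalg.lstsq(C, y, rcond=None)
resid = y - C @ coef

# First-order Stein identity for x ~ N(0, I): E[resid * x] is proportional to beta,
# so the sample moment recovers the index direction up to scale.
beta_hat = (resid[:, None] * x).mean(axis=0)
beta_hat /= np.linalg.norm(beta_hat)

cosine = float(beta_hat @ beta)   # alignment between estimated and true direction
print(round(cosine, 3))
```

The single index `x @ beta_hat` can then be passed to any downstream predictor together with the clinical covariates; with non-Gaussian features the plain moment above would need a score-function correction, which this sketch does not attempt.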



Re-Think and Re-Design Graph Neural Networks in Spaces of Continuous Graph Diffusion Functionals

Neural Information Processing Systems

S1.1 Step-by-step derivation of min-max optimization in Section 2.2.1. By substituting Eq. 2 into Eq. 1 in the main manuscript, we can obtain the objective function of $z$ (we temporarily drop the subscript $i$ for clarity): $J(z) = \max$ Since $z$ might be in a high-dimensional space, solving such a large system of linear equations under the constraint $|z| \le 1$ is often computationally challenging. In order to find a practical solution for $z$ that satisfies the constrained minimization problem in Eq. By setting $z_l$ as the point of coincidence, we can find a separable majorizer of $M(z)$ by adding the non-negative function $(z - z_l)^\top (\beta I - G_x G_x^\top)(z - z_l)$ (S6). Note, to unify the format, we use the matrix transpose property in Eq. Then, the next step is to find $z \in \mathbb{R}^N$ that minimizes $z^\top z - 2 b^\top z$ subject to the constraint $|z| \le 1$. Let's first consider the simplest case where $z$ is a scalar: argmin If $|b| \le 1$, then the solution is $z = b$. 37th Conference on Neural Information Processing Systems (NeurIPS 2023).
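
The scalar case can be checked numerically. The sketch below assumes the constraint reads |z| <= 1 (the inequality sign is lost in the extracted text) and that the scalar objective is z^2 - 2bz. Under those assumptions the minimizer is b when |b| <= 1 and the nearest endpoint of [-1, 1] otherwise, which is elementwise clipping; because the objective is separable, the same rule applies coordinate by coordinate in the vector case. The function name `box_constrained_min` is hypothetical.

```python
import numpy as np

def box_constrained_min(b):
    """Minimize z**2 - 2*b*z elementwise subject to |z| <= 1 (assumed constraint).

    The unconstrained minimizer is z = b; projecting it onto [-1, 1]
    gives the constrained minimizer because the objective is convex in z.
    """
    return np.clip(b, -1.0, 1.0)

b = np.array([-2.5, -0.3, 0.0, 0.7, 4.0])
z_star = box_constrained_min(b)

# Brute-force check on a fine grid over the feasible interval [-1, 1].
grid = np.linspace(-1.0, 1.0, 2001)
brute = np.array([grid[np.argmin(grid**2 - 2.0 * bi * grid)] for bi in b])

print(z_star)
print(brute)
```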



Appendix

Neural Information Processing Systems

In this appendix, we first introduce the datasets and evaluation metrics used in the experiments in Section A. Then, we provide extra experimental results in Section B. In Section C, we present details of network design, training scheme, and hyper-parameter tuning. We conduct experiments on 11 popular time series datasets: (1) Electricity Transformer Temperature [42] (ETTh1, ETTh2, ETTm1) consists of 2 years of electric power data collected from two separate counties in China. Each data point includes an "oil temperature" value and 6 power load features. The data is aggregated into 5-minute windows, resulting in 12 points per hour and 288 points per day. A.1 Electricity Transformer Temperature (ETT) For data pre-processing, we perform zero-mean normalization, i.e., X We use Mean Absolute Errors (MAE) [17] and Mean Squared Errors (MSE) [26] for model comparison.
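
As a rough illustration of the pre-processing and metrics mentioned above, the following sketch computes a zero-mean normalization, MAE and MSE with NumPy. The normalization formula is truncated in the text, so dividing by the standard deviation is an assumption here, as are the function names and the toy arrays. The last lines reproduce the window arithmetic: 5-minute windows give 12 points per hour and 288 per day.

```python
import numpy as np

def zero_mean_normalize(x, eps=1e-8):
    """Subtract the per-feature mean; scaling by the std is an assumption."""
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=0, keepdims=True)
    std = x.std(axis=0, keepdims=True)
    return (x - mean) / (std + eps)

def mae(y_true, y_pred):
    """Mean Absolute Error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def mse(y_true, y_pred):
    """Mean Squared Error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))

# Toy series with two features (rows are time steps); values are made up.
series = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
normalized = zero_mean_normalize(series)

# Toy forecast comparison.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 5.0]
print(mae(y_true, y_pred), mse(y_true, y_pred))

# Sampling arithmetic for 5-minute windows.
points_per_hour = 60 // 5
points_per_day = 24 * points_per_hour
print(points_per_hour, points_per_day)
```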



A Local Method for Satisfying Interventional Fairness with Partially Known Causal Graphs

Neural Information Processing Systems

Developing fair automated machine learning algorithms is critical in making safe and trustworthy decisions. Many causality-based fairness notions have been proposed to address this need by quantifying the causal connections between sensitive attributes and decisions, and when the true causal graph is fully known, certain algorithms that achieve interventional fairness have been proposed. However, when the true causal graph is unknown, it is still challenging to effectively and efficiently exploit partially directed acyclic graphs (PDAGs) to achieve interventional fairness. To exploit PDAGs for achieving interventional fairness, previous methods have been built on variable selection or causal effect identification, but they suffer from reduced prediction accuracy or rely on strong assumptions. In this paper, we propose a general min-max optimization framework that can achieve interventional fairness with promising prediction accuracy and can be extended to maximally oriented PDAGs (MPDAGs) with added background knowledge. Specifically, we first estimate all possible treatment effects of sensitive attributes on a given prediction model from all possible adjustment sets of sensitive attributes via an efficient local approach. Next, we propose to alternately update the prediction model and the possible estimated causal effects, where the prediction model is trained via a min-max loss to control the worst-case fairness violations. Extensive experiments on synthetic and real-world datasets verify the superiority of our methods.
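
One way to picture the worst-case term in a min-max loss of this kind is sketched below. This is not the paper's algorithm: the function name `min_max_objective`, the weight `lam`, and the idea of summarizing each candidate adjustment set by a single scalar effect estimate are assumptions made for illustration. The inner maximization picks the largest absolute candidate effect, and the outer minimization (not shown) would train the prediction model against that worst case.

```python
import numpy as np

def min_max_objective(pred_loss, candidate_effects, lam=1.0):
    """Prediction loss plus a penalty on the worst-case fairness violation.

    candidate_effects holds one estimated causal effect of the sensitive
    attribute on the prediction per candidate adjustment set (hypothetical
    inputs); the penalty uses the largest one in absolute value.
    """
    effects = np.asarray(candidate_effects, dtype=float)
    worst_case = float(np.max(np.abs(effects)))
    return pred_loss + lam * worst_case

# Made-up numbers: three candidate adjustment sets give three effect estimates.
objective = min_max_objective(pred_loss=0.5, candidate_effects=[0.1, -0.4, 0.2], lam=2.0)
print(objective)
```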


Beyond Accuracy: Ensuring Correct Predictions With Correct Rationales

Neural Information Processing Systems

Large pretrained foundation models demonstrate exceptional performance and, in some high-stakes applications, even surpass human experts. However, most of these models are currently evaluated primarily on prediction accuracy, overlooking the validity of the rationales behind their accurate predictions. For the safe deployment of foundation models, there is a pressing need to ensure correct predictions backed by correct rationales. To achieve this, we propose a two-phase scheme: First, we curate a new dataset that offers structured rationales for visual recognition tasks. Second, we propose a rationale-informed optimization method to guide the model in disentangling and localizing visual evidence for each rationale, without requiring manual annotations. Extensive experiments and ablation studies demonstrate that our model outperforms state-of-the-art models by up to 10.1% in prediction accuracy across a wide range of tasks. Furthermore, our method significantly improves the model's rationale correctness, improving localization by 7.5% and disentanglement by 36.5%.